Abstract. Our previous results suggest that the use of Criterion, an automatic writing
evaluation (AWE) system, is particularly successful in encouraging learners to produce 
amended drafts of their essays, and that those amended drafts generally represent an 
improvement on the original submission. Our analysis of the submitted essays and the 
feedback provided on the first drafts suggests, however, that the students use a variety of 
quite different strategies when using the automated computer-based feedback to produce 
amended drafts. These include simply accepting a suggested correction, interpreting a 
feedback comment to modify the text, and avoidance strategies such as leaving out text 
that was highlighted as incorrect or problematic. Our data suggest that the strategies the 
students use are at least partly influenced by the confidence they have in the feedback, 
and therefore in the system itself, but may also be influenced by their interpretation of 
how marks are awarded by the system. This presentation will discuss the findings of an
in-depth analysis of the changes made in second drafts submitted to the system, linking
the changes to the automatic feedback provided on the first draft, and exploring the 
reasons for the changes made by the students. We will suggest ways in which teachers 
can explore the utility of various strategies with their learners. 
1. Introduction

One of the more difficult tasks that learners face is developing proficiency in writing, and 
it is generally assumed that timely and appropriate feedback is important in developing 
such proficiency (Black & Wiliam, 1998a, 1998b; Hyland & Hyland, 2006). There 
is, however, less agreement on how feedback can be most effectively targeted (on 
grammar, lexis or organisation/structure), on whether feedback should be explicit or 
implicit, and on whether feedback is best provided by tutors, or peers, or a combination
of the two. Research does not provide a clear answer to these questions, and teachers 
have developed a variety of pragmatic solutions, usually involving provision of at least 
some feedback themselves. However, this is inevitably time-consuming, especially if 
they attempt to provide feedback which is individualised, content-related, and timely, 
and if they encourage the production of multiple drafts (Grimes & Warschauer, 2010; 
Lee, Wong, Cheung, & Lee, 2009). 

Peer feedback is a widely used technique that can help to reduce the teacher’s
workload by shifting the focus in feedback from the teacher’s actions and opinions alone
to those of both the teacher and the learners (e.g., Ferris, 2003). Research suggests, however,
that learners see peer feedback as serving a different purpose from instructor feedback 
(Jacobs, Curtis, Brain, & Huang, 1998). Whatever approach is adopted, therefore, at 
least some teacher feedback is likely to be desirable and probably necessary, although 
this becomes increasingly difficult to provide as the number of students a teacher has 
to deal with increases. One possible solution is to exploit advances in technology such 
as computer applications which are claimed to be capable not only of assessing written 
work, but also of generating feedback for the learners - “intelligent CALL” which can 
interact with the material to be learned, including (providing) meaningful feedback and 
guidance (Warschauer & Healey, 1998). 

There is published research on the use of such applications for assessing writing 
(e.g., Rudner & Liang, 2002), comparing human scoring to computer scoring (e.g., 
Wang & Brown, 2007), and validating computerised scoring systems (e.g., Powers, 
Burstein, Chodorow, Fowles, & Kukich, 2001), and it is claimed that such applications 
match the reliability of human raters in assessing writing (e.g., Dikli, 2006). However,
there is still relatively little research that has investigated the value of computer-based 
feedback (CBF) on students’ written work (e.g., Attali, 2004; Coniam, 2009), and much 
of it relates to L1, or English as an additional language (EAL) rather than English as a
foreign language (EFL), writers of English. This paper reports the results of using one
such automatic writing evaluation system - Criterion - with four different classes of 
EFL students in a variety of contexts over the last four years. 

2. Method 

2.1. Aims and participants 

This paper is based on the results of four studies that were conducted in Alexandria 
University, Egypt (N = 24), Hail University, Saudi Arabia (N = 23), and Newcastle 
University, UK (N = 11 and N = 15) between 2008 and 2012. The participants were
all university students studying academic English, though learning English in order to 
study a variety of different subjects (the Alexandria students were training to be English 
teachers, for example, while the Newcastle students were planning to study a variety of 
other subjects at postgraduate level). 
Each study had its own particular focus, but aims that were common across the studies
were to investigate learners’ attitudes towards the computer-based feedback they were 
given, the nature of the feedback that Criterion provided, and what actions learners took 
as a result of the feedback. The aim of this paper is to use data from the four studies to
investigate the ways in which AWE systems can be used, either on their own or together
with teacher feedback, and the actions learners took once they had received feedback. We
therefore study the changes made in second drafts submitted to the system, the content of
the automatic feedback on the first draft that is linked to those changes, and, where
possible, the reasons for the changes that learners made.

2.2. Results 

There was a very high re-submission rate for the essays, i.e., almost all learners 
in the studies submitted a second revised draft for each title, using feedback 
provided by Criterion on what it categorises as grammar, usage, mechanics, style 
and organisation. The accuracy of the feedback in these categories varied in our 
studies with, for example, feedback on organisation and development tending to be 
rather unpredictable. Comments in this category which referred to missing “thesis 
statements”, for example, sometimes accurately highlighted a problem but at other 
times simply failed to correctly identify that the essay did contain such a statement. 
Feedback in other categories sometimes correctly identified a problem, but not 
necessarily the cause (a missing auxiliary was sometimes the reason for Criterion 
highlighting a verb and labelling it as “ill-formed”, for example). Criterion also had 
difficulty - not unexpectedly - in correctly identifying where the use of, or lack of, 
an article was a problem. 

Nevertheless, according to Criterion’s own marking system, the second draft
submitted by a learner was almost always better than, or at least at the same level as,
the first, and examination of some sample essays confirmed that this did indeed
appear to be the case (in some of the studies there was some teacher correction of 
second drafts as well as computer feedback). One possible explanation for this is
that Criterion correctly identified enough surface-level errors for the learner to
correct them and produce a second draft that was better than the first, at least in
terms of those features. A second explanation is that the simple
fact of receiving feedback on a first draft encouraged the learners to reflect not only 
on the highlighted problems, but on other aspects of their draft, before revising and 
resubmitting the essay. A third explanation is that even feedback that is ambiguous 
or inexplicit may, by encouraging reflection, lead a learner to find a correct, or more 
acceptable, alternative to a highlighted problem, suggesting that learners may be able 
to benefit from such feedback if they already have the required linguistic resources 
at their disposal. An example of this third case was a student who was observed reading
a Criterion comment that referred to a “fragment, or a subject or verb missing”. In 
fact the highlighted problem was a verb in the present simple, which should have 
been in the progressive. The learner was observed to consider the comment, and the
highlighted problem, at some length, eventually correctly changing “begins” to “is 
beginning”. 

3. Discussion 

Our preliminary analysis of the results suggests the following: 

• Criterion proved useful in the variety of contexts in which it was tried, and 
especially in the situations where learners would normally be offered little, if 
any, teacher feedback; 

• In classes with more proficient learners, where regular teacher feedback was
expected, Criterion was received positively, though with some reservations;

• It seemed to be most useful for learners at or below intermediate or upper- 
intermediate level; 

• There was a high rate of submission of second drafts among all groups (for 
practical reasons learners were limited to submitting two drafts); 

• Where teacher feedback was also available, learners found the process of 
receiving automatic feedback on drafts useful in helping them produce an 
improved final draft which they hoped would be well received by the teacher; 

• The accuracy of the feedback provided by Criterion varied, as did its specificity, 
and its apparent value to the student. There is nevertheless evidence that the 
feedback encouraged the learners to reflect on their writing, to act on their 
reflections, and to produce improved drafts; 

• There is also some evidence that reflection on even ambiguous feedback could 
result in successful correction, perhaps of “mistakes” rather than, in error analysis 
terminology, “errors”. 

4. Conclusions 

There is still much to analyse in the data, but our tentative conclusion is that Criterion 
appears to be most suited to EFL learners at, or below, an intermediate or upper- 
intermediate level. It is especially effective at encouraging learners to reflect on their 
writing, and to produce second drafts. Given the nature of the feedback that Criterion 
provides, and the focus of the feedback, it is likely to be most useful when used in 
conjunction with teacher feedback. Work on the first two drafts can help learners 
eliminate some of the surface-level errors, and encourage them to evaluate the structure
and organisation of their writing, allowing the teacher more time to comment on content 
in subsequent drafts. 

There is a range of strategies that remain to be explored for combining computer
and teacher feedback, including the possibility of integrating computer and teacher
feedback with peer feedback. In addition, although our studies were carried out as an
integral part of normal language courses, each lasted no more than a few weeks. We
therefore have, as yet, no data that would allow us to be confident that we have
progressed beyond the possible influence of a novelty effect, or to investigate
long-term changes in attitudes towards computer-based feedback and the long-term
effect on writing.